69 research outputs found

    Low-Weight Primes for Lightweight Elliptic Curve Cryptography on 8-bit AVR Processors

    Get PDF
    Small 8-bit RISC processors and micro-controllers based on the AVR instruction set architecture are widely used in the embedded domain with applications ranging from smartcards over control systems to wireless sensor nodes. Many of these applications require asymmetric encryption or authentication, which has spurred a body of research into implementation aspects of Elliptic Curve Cryptography (ECC) on the AVR platform. In this paper, we study the suitability of a special class of finite fields, the so-called Optimal Prime Fields (OPFs), for a "lightweight" implementation of ECC with a view towards high performance and security. An OPF is a finite field Fp defined by a prime of the form p = u*2^k + v, whereby both u and v are "small" (in relation to 2^k) so that they fit into one or two registers of an AVR processor. OPFs have a low Hamming weight, which allows for a very efficient implementation of the modular reduction since only the non-zero words of p need to be processed. We describe a special variant of Montgomery multiplication for OPFs that does not execute any input-dependent conditional statements (e.g. branch instructions) and is, hence, resistant against certain side-channel attacks. When executed on an Atmel ATmega processor, a multiplication in a 160-bit OPF takes just 3237 cycles, which compares favorably with other implementations of 160-bit modular multiplication on an 8-bit processor. We also describe a performance-optimized and a security-optimized implementation of elliptic curve scalar multiplication over OPFs. The former uses a GLV curve and executes in 4.19M cycles (over a 160-bit OPF), while the latter is based on a Montgomery curve and has an execution time of approximately 5.93M cycles. Both results improve the state-of-the-art in lightweight ECC on 8-bit processors

    MoTE-ECC: Energy-Scalable Elliptic Curve Cryptography for Wireless Sensor Networks

    Get PDF
    Wireless Sensor Networks (WSNs) are susceptible to a wide range of malicious attacks, which has stimulated a body of research on "light-weight" security protocols and cryptographic primitives that are suitable for resource-restricted sensor nodes. In this paper we introduce MoTE-ECC, a highly optimized yet scalable ECC library for Memsic's MICAz motes and other sensor nodes equipped with an 8-bit AVR processor. MoTE-ECC supports scalar multiplication on Montgomery and twisted Edwards curves over Optimal Prime Fields (OPFs) of variable size, e.g. 160, 192, 224, and 256 bits, which allows for various trade-offs between security and execution time (resp. energy consumption). OPFs are a special family of "low-weight" prime fields that, in contrast to the NIST-specified fields, facilitate a parameterized implementation of the modular arithmetic so that one and the same software function can be used for operands of different length. To demonstrate the performance of MoTE-ECC, we take (ephemeral) ECDH key exchange between two nodes as example, which requires each node to execute two scalar multiplications. The first scalar multiplication is performed on a fixed base point (to generate a key pair), whereas the second scalar multiplication gets an arbitrary point as input. Our implementation uses a fixed-base comb method on a twisted Edwards curve for the former and a simple ladder approach on a birationally-equivalent Montgomery curve for the latter. Both scalar multiplications require about 9*10^6 clock cycles in total and occupy only 380 bytes in RAM when the underlying OPF has a length of 160 bits. We also describe our efforts to harden MoTE-ECC against side-channel attacks (e.g. simple power analysis) and introduce a highly regular implementation of the comb method

    Time efficient dual-field unit for cryptography-related processing

    No full text
    [ITALIAN VERSION, English version below] Il lavoro si colloca nell'ambito dell'aritmetica dei calcolatori. In particolare, esso tratta la moltiplicazione modulare, un'operazione centrale in diverse aree quali i codici per il controllo degli errori e l'implementazione di algoritmi crittografici. In effetti, gli onerosi algoritmi di crittografia a chiave pubblica, quali i crittosistemi basati sull’algoritmo Rivest-Shamir-Adleman (RSA) e su Curve Ellittiche (EC), sono profondamente influenzati nelle loro prestazioni dalla moltiplicazione modulare. In crittografia, tale operazione può essere effettuata in due differenti strutture algebriche, precisamente i campi finiti GF(N) e GF(2^m), che normalmente richiedono soluzioni hardware distinte per velocizzare i calcoli. Il prodotto di Montgomery rappresenta la soluzione maggiormente adottata in quanto essa consente un'efficiente implementazione hardware, purché si adotti una definizione leggermente modificata di prodotto modulare. In questo lavoro, si propone una innovativa architettura unificata per il prodotto di Montgomery parallelo che consenta operazioni sia nel campo finito GF(N) che in GF(2^m), critici per i crittosistemi a chiave pubblica RSA ed ECC. Lo schema hardware intercala moltiplicazione e riduzione modulare. Inoltre, esso manipola il moltiplicando attraverso uno schema modificato di ricodifica di Booth, ed adotta uno approccio radix-4 per il modulo, consentendo tempi di esecuzione ridotti anche per dimensioni degli operandi ragionevolmente grandi. Si presenta inoltre nell’articolo un'architettura pipelined basata sui blocchi paralleli precedentemente introdotti, che richiede un numero di colpi di clock estremamente ridotto ed alti livelli di throughput per gli operandi lunghi tipicamente adoperati in applicazioni crittografiche. Una serie di risultati sperimentali, basati su una tecnologia CMOS a 0,18 µm, dimostra l'efficacia delle tecniche proposte superando i migliori risultati presentati precedentemente in letteratura. [ENGLISH VERSION] This work deals with computer arithmetic. Specifically, it focuses on modular multiplication, a central operation in several areas such as error control codes and cryptographic computing. In fact, computational demanding public key cryptographic algorithms, such as Rivest-Shamir-Adleman (RSA) and Elliptic Curve (EC) cryptosystems, are deeply affected by modular multiplication for their performance. Modular multiplication used in cryptography may be performed in two different algebraic structures, namely GF(N) and GF(2^n), which normally require distinct hardware solutions for speeding up performance. Montgomery multiplication is the most widely adopted solution, as it enables efficient hardware implementations, provided that a slightly modified definition of modular multiplication is adopted. In the paper, we propose a novel unified architecture for parallel Montgomery multiplication supporting both GF(N) and GF(2^n) finite field operations, which are critical for RSA ad ECC public key cryptosystems. The hardware scheme interleaves multiplication and modulo reduction. Furthermore, it relies on a modified Booth recoding scheme for the multiplicand and a radix-4 scheme for the modulus, enabling reduced time delays even for moderately large operand widths. In addition, we present a pipelined architecture based on the parallel blocks previously introduced, enabling very low clock counts and high throughput levels for long operands used in cryptographic applications. Experimental results, based on 0.18 µm CMOS technology, prove the effectiveness of the proposed techniques, and outperform the best results previously presented in the technical literature

    A Simple Architectural Enhancement for Fast and Flexible Elliptic Curve Cryptography Over Binary Finite Fields GF(2m)

    No full text
    Abstract. Mobile and wireless devices like cell phones and networkenhanced PDAs have become increasingly popular in recent years. The security of data transmitted via these devices is a topic of growing importance and methods of public-key cryptography are able to satisfy this need. Elliptic curve cryptography (ECC) is especially attractive for devices which have restrictions in terms of computing power and energy supply. The efficiency of ECC implementations is highly dependent on the performance of arithmetic operations in the underlying finite field. This work presents a simple architectural enhancement to a generalpurpose processor core which facilitates arithmetic operations in binary finite fields GF(2 m). A custom instruction for a multiply step for binary polynomials has been integrated into a SPARC V8 core, which subsequently served to compare the merits of the enhancement for two different ECC implementations. One was tailored to the use of GF(2 191) with a fixed reduction polynomial. The tailored implementation was sped up by 90 % and its code size was reduced. The second implementation worked for arbitrary binary fields with a range of reduction polynomials. The flexible implementation was accelerated by a factor of nearly 10

    Efficient Implementation of NIST-Compliant Elliptic Curve Cryptography for Sensor Nodes

    Get PDF
    In this paper, we present a highly-optimized implementation of standards-compliant Elliptic Curve Cryptography (ECC) for wireless sensor nodes and similar devices featuring an 8-bit AVR processor. The field arithmetic is written in Assembly language and optimized for the 192-bit NIST-specified prime p = 2^192 - 2^64 - 1, while the group arithmetic (i.e. point addition and doubling) is programmed in ANSI C. One of our contributions is a novel lazy doubling method for multi-precision squaring which provides better performance than any of the previously-proposed squaring techniques. Based on our highly optimized arithmetic library for the 192-bit NIST prime, we achieve record-setting execution times for scalar multiplication (with both fixed and arbitrary points) as well as multiple scalar multiplication. Experimental results, obtained on an AVR ATmega128 processor, show that the two scalar multiplications of ephemeral Elliptic Curve Diffie-Hellman (ECDH) key exchange can be executed in 1.75 s altogether (at a clock frequency of 7.37 MHz) and consume an energy of some 42 mJ. The generation and verification of an ECDSA signature requires roughly 1.91 s and costs 46 mJ at the same clock frequency. Our results significantly improve the state-of-the-art in ECDH and ECDSA computation on the P-192 curve, outperforming the previous best implementations in the literature by a factor of 1.35 and 2.33, respectively. We also protected the field arithmetic and algorithms for scalar multiplication against side-channel attacks, especially Simple Power Analysis (SPA)
    corecore